NSF PAR Search | NSF Public Access Repository

MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

Jiang, Hanwen; Xu, Zexiang; Xie, Desai; Chen, Ziwen; Jin, Haian; Luan, Fujun; Shu, Zhixin; Zhang, Kai; Bi, Sai; Sun, Xin; et al. (June 2025, IEEE/CVF International Conference on Computer Vision)

Free, publicly-accessible full text available June 1, 2026
Temporally Consistent Relighting for Portrait Videos

Chandran, Sreenithy; Hold-Geoffroy, Yannick; Sunkavalli, Kalyan; Shu, Zhixin; Jayasuriya, Suren (January 2022, Workshop on Applications of Computational Imaging co-located with the IEEE/CVF Winter Conference on Applications of Computer Vision)

Ensuring ideal lighting when recording videos of people can be a daunting task requiring a controlled environment and expensive equipment. Methods were recently proposed to perform portrait relighting for still images, enabling after-the-fact lighting enhancement. However, naively applying these methods to each frame independently yields videos plagued with flickering artifacts. In this work, we propose the first method to perform temporally consistent video portrait relighting. To achieve this, our method jointly optimizes the desired lighting and temporal consistency end-to-end. We do not require ground truth lighting annotations during training, allowing us to take advantage of the large corpus of portrait videos already available on the internet. We demonstrate that our method outperforms previous work in balancing accurate relighting and temporal consistency on a number of real-world portrait videos.
Full Text Available
DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks

https://doi.org/10.1109/ICCV.2019.00022

Das, Sagnik; Ma, Ke; Shu, Zhixin; Samaras, Dimitris; Shilkrot, Roy (October 2019, IEEE/CVF International Conference on Computer Vision)

Capturing document images with hand-held devices in unstructured environments is a common practice nowadays. However, “casual” photos of documents are usually unsuitable for automatic information extraction, mainly due to physical distortion of the document paper, as well as various camera positions and illumination conditions. In this work, we propose DewarpNet, a deep-learning approach for document image unwarping from a single image. Our insight is that the 3D geometry of the document not only determines the warping of its texture but also causes the illumination effects. Therefore, our novelty lies in the explicit modeling of 3D shape for document paper in an end-to-end pipeline. We also contribute Doc3D, the largest and most comprehensive dataset for document image unwarping to date. This dataset features multiple ground-truth annotations, including 3D shape, surface normals, UV map, albedo image, etc. Training with Doc3D, we demonstrate state-of-the-art performance for DewarpNet with extensive qualitative and quantitative evaluations. Our network also significantly improves OCR performance on captured document images, decreasing character error rate by 42% on average. Both the code and the dataset are released.
Full Text Available